"Computer vision is an interdisciplinary field that deals with how computers can be made for gaining high-level understanding from digital images or videos. From the perspective of engineering, it seeks to automate tasks that the human visual system can do." ˆHuang, T. (1996-11-19). Vandoni, Carlo, E, ed. Computer Vision : Evolution And Promise
"Computer vision is concerned with the automatic extraction, analysis and understanding of useful information from a single image or a sequence of images. It involves the development of a theoretical and algorithmic basis to achieve automatic visual understanding." ˆhttp://www.bmva.org/visionoverview The British Machine Vision Association and Society for Pattern Recognition Retrieved February 20, 2017
Source : https://medium.com/readers-writers-digest/beginners-guide-to-computer-vision-23606224b720

Source : Szeliski, R. (2010). Computer Vision: Algorithms and Applications. Springer.

Classification in ImageNet : $$e = \frac{1}{n} \sum_{k} \min_{j} d(l_{j}, g_{k}) \tag{1}$$
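To make equation (1) concrete, here is a minimal sketch of how the per-image error could be computed. It assumes the usual ImageNet challenge convention, which is not spelled out above: $l_j$ are the labels predicted for an image, $g_k$ are its $n$ ground-truth labels, and $d(l_j, g_k)$ is 0 when the two labels match and 1 otherwise. The function names `image_error` and `dataset_error` are hypothetical and chosen only for this illustration.

```python
def image_error(predicted_labels, ground_truth_labels):
    """Per-image error e = (1/n) * sum_k min_j d(l_j, g_k).

    Assumption: d(l, g) is 0 if l == g and 1 otherwise.
    """
    n = len(ground_truth_labels)
    if n == 0:
        raise ValueError("need at least one ground-truth label")
    total = 0
    for g in ground_truth_labels:
        # The minimum over j is 0 exactly when some predicted label equals g.
        # With no predictions at all, the ground-truth label counts as missed.
        total += min((0 if l == g else 1 for l in predicted_labels), default=1)
    return total / n


def dataset_error(all_predictions, all_ground_truths):
    """Average the per-image errors over a collection of images."""
    errors = [image_error(p, g) for p, g in zip(all_predictions, all_ground_truths)]
    return sum(errors) / len(errors)


# Toy example with two images and up to five guesses each.
predictions = [
    ["cat", "dog", "car", "tree", "boat"],
    ["cat", "dog", "car", "tree", "boat"],
]
ground_truths = [["dog"], ["bird"]]
print(dataset_error(predictions, ground_truths))
```

In this toy example the first image is counted as correct because "dog" appears among its guesses, while the second is counted as an error because "bird" does not, so half of the images are misclassified.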
The image classification pipeline :



Paper : https://arxiv.org/abs/1512.03385
"In object detection it is possible to detect more than 1 object"

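import base64
from IPython.display import HTML

# Read the clip as raw bytes; read-only binary mode is enough to embed it.
with open('./media/CV/MOVIE1.MOV', 'rb') as f:
    video = f.read()

# Base64-encode the bytes so the video can be inlined as a data URI.
encoded = base64.b64encode(video).decode('ascii')

# The file is a .MOV, but the MIME type is kept as video/mp4 as in the original.
HTML(data='''<center><video alt="test" controls style="width: 300px;">
<source src="data:video/mp4;base64,{0}" type="video/mp4" />
</video></center>'''.format(encoded))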